Now we are in the second exercise session. Non-flat files were promised, so please load the GESIS Panel COVID-19 survey data.
This first part of the exercises only deals with importing data. Later, in the second part, we will turn to labelling and exporting.
Hint: the file is in SPSS format (.sav), so use the haven package.
library(haven)

gp_covid <-
  read_spss("./data/ZA5667_v1-1-0.sav")
In contrast to flat files such as CSV, the variables now carry labels.
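To see why labels appear only with non-flat files, here is a minimal sketch that does not need the GESIS Panel file. It assumes a small made-up data frame (toy) that is written once as an SPSS file and once as CSV to temporary files and then re-imported. The data and file names are hypothetical stand-ins.

```r
library(haven)

# Toy data standing in for the survey (made-up values, one labelled variable)
toy <- data.frame(sex = c(1, 2, 2))
attr(toy$sex, "label") <- "Geschlecht"

# Write the same data as a non-flat (SPSS) and a flat (CSV) file
sav_file <- tempfile(fileext = ".sav")
csv_file <- tempfile(fileext = ".csv")
write_sav(toy, sav_file)
write.csv(toy, csv_file, row.names = FALSE)

# Re-import both files
from_sav <- read_spss(sav_file)
from_csv <- read.csv(csv_file)

# The variable label is stored in the "label" attribute
attr(from_sav$sex, "label")  # label is kept in the SPSS file
attr(from_csv$sex, "label")  # NULL, CSV has no place to store labels
```

The values are identical in both imports; only the metadata differs.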
Hint: you can use sjlabelled::get_label(your_data), but make sure to print the labels of the first ten variables only.
library(sjlabelled)
get_label(gp_covid[1:10])
## za_number version doi
## "Studiennummer des Archivs" "Versionskennung und -datum des Archivs" "Digital Object Identifier (doi)"
## id cohort sex
## "Befragten-ID" "Rekrutierungskohorte" "Geschlecht"
## age_cat education_cat intention_to_vote
## "Alter, kategorisiert" "Bildung, kategorisiert" "Sonntagsfrage (gbzc011a)"
## choice_of_party
## "Sonntagsfrage Wahlentscheidung"
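If you wonder where these labels live: each one is stored as a "label" attribute on its column. The following sketch collects them with base R only, which is roughly what a label getter has to do. It assumes a made-up data frame (toy) in which every column has a label; columns without a label would need extra handling.

```r
# Toy data with two labelled columns (made-up values)
toy <- data.frame(id = 1:3, sex = c(1, 2, 2))
attr(toy$id, "label") <- "Befragten-ID"
attr(toy$sex, "label") <- "Geschlecht"

# Collect the "label" attribute of every column into a named character vector
labels <- vapply(
  toy,
  function(x) attr(x, "label", exact = TRUE),
  character(1)
)

labels
```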
Unfortunately, all labels are in German. Imagine you are an education researcher interested in the variable education_cat. You may want to translate its variable label into English.
Change the variable label of education_cat from “Bildung, kategorisiert” to “Education, categorized”.
Hint: you can use sjlabelled::set_label() or do it in a pipe with sjlabelled::var_labels().
# either
gp_covid$education_cat <-
  set_label(
    gp_covid$education_cat,
    label = "Education, categorized"
  )

# or
library(dplyr)

gp_covid <-
  gp_covid %>%
  var_labels(
    education_cat = "Education, categorized"
  )

# proof
get_label(gp_covid$education_cat)
## [1] "Education, categorized"
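If you want to translate more than one label, var_labels() accepts several name-label pairs in a single call. The sketch below uses a made-up data frame (toy) with two of the German labels printed earlier; the English wordings are my own translations, not official ones.

```r
library(sjlabelled)

# Toy data with German variable labels (made-up values)
toy <- data.frame(sex = c(1, 2, 2), age_cat = c(3, 5, 7))
attr(toy$sex, "label") <- "Geschlecht"
attr(toy$age_cat, "label") <- "Alter, kategorisiert"

# Replace both labels in one call
toy <- var_labels(
  toy,
  sex = "Sex",
  age_cat = "Age, categorized"
)

get_label(toy)
```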
Your colleague asks you to share the data with the changed labels. Unfortunately, she uses neither R nor SPSS and asks you to export the data as a Stata file.
Hint: use the haven package again.
write_dta(gp_covid, "gesis_panel_corona_fancy_panels_final_final.dta")
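Before sending the file, it can be worth checking that the translated label survives the export. A minimal round-trip sketch, assuming a made-up data frame (toy) and a temporary file instead of the real data:

```r
library(haven)

# Toy data with the translated variable label (made-up values)
toy <- data.frame(education_cat = c(1, 2, 3))
attr(toy$education_cat, "label") <- "Education, categorized"

# Export to Stata format, then read the file back in
dta_file <- tempfile(fileext = ".dta")
write_dta(toy, dta_file)
reimported <- read_dta(dta_file)

# The variable label should still be attached after the round trip
attr(reimported$education_cat, "label")
```

The same check works for the real data by swapping in gp_covid and the export path.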